system store - reduce load time (part of performance effort 4.20) #9096


Draft: alphaprinz wants to merge 2 commits into master from system_store_load

Conversation

@alphaprinz (Contributor) commented Jun 12, 2025

Describe the Problem

Reduce the CPU load of system_store.load().

Explain the Changes

  1. REMOVED - Introduce "fresh enough": if a load was performed recently, don't load again.
  2. Perform DB queries concurrently (instead of sequentially).
  3. REMOVED - Reduce the semaphore's blocking scope.
  4. Remove publishing of changes from the heartbeat (hb) and md_aggregator background services.
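The core of change 2 is replacing sequential awaits with a concurrency-limited parallel map. A minimal sketch of that pattern, in the spirit of the `P.map_with_concurrency` helper from `src/util/promise.js` (the collection names below are purely illustrative, not the real schema):

```javascript
// A concurrency-limited map: runs `mapper` over `items` with at most
// `limit` calls in flight at once.
async function map_with_concurrency(limit, items, mapper) {
    const results = new Array(items.length);
    let next = 0;
    // Spawn up to `limit` workers; each pulls the next free index until drained.
    const workers = Array.from(
        { length: Math.min(limit, items.length) },
        async () => {
            while (next < items.length) {
                const i = next;
                next += 1;
                results[i] = await mapper(items[i]);
            }
        }
    );
    await Promise.all(workers);
    return results;
}

// Usage sketch: load all collections with at most 5 queries in flight,
// instead of awaiting each query one after the other.
async function load_all_collections(query_collection) {
    const collections = ['systems', 'buckets', 'accounts', 'pools'];
    return map_with_concurrency(5, collections, query_collection);
}
```

With N collections and per-query latency t, the sequential version costs roughly N*t while the concurrent version approaches ceil(N/limit)*t.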

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

  • Doc added/updated
  • Tests added

Summary by CodeRabbit

Here are the updated release notes:

  • New Features
    • Added a new configuration option to control concurrency during database data loading.
    • Introduced a new API endpoint to retrieve system store data.
    • Enabled selection between database and core system sources for system data loading.
  • Improvements
    • Optimized data loading by limiting parallel database reads based on configurable concurrency.
    • Enhanced system store refresh with detailed debug logging and conditional cluster-wide notifications.
    • Improved control over system updates by allowing suppression of cluster-wide notifications.

coderabbitai bot commented Jun 12, 2025

Review skipped: draft detected. To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

The changes introduce a new configuration constant SYSTEM_STORE_LOAD_CONCURRENCY to limit concurrency during database reads in the SystemStore class. The SystemStore now supports two data sources, DB and CORE, selectable via environment. The refresh() method stores and returns its result instead of returning early. Calls to make_changes() in background services now include an optional publish parameter to control cluster-wide notifications. A new API method and server function expose system store data. Schema strictification is disabled. A JSON decode utility is exported.

Changes

Changes by file:

  • config.js: Added SYSTEM_STORE_LOAD_CONCURRENCY constant from environment with default 5; reduced max DB clients.
  • src/server/system_services/system_store.js: Added SOURCE enum and source property; support for CORE and DB sources; concurrency-limited DB reads using SYSTEM_STORE_LOAD_CONCURRENCY; updated refresh() to store and return its result; added recent_db_data() method; added optional publish parameter to make_changes().
  • src/server/bg_services/cluster_hb.js: Added explicit true as second argument to the system_store.make_changes() call in heartbeat.
  • src/server/bg_services/md_aggregator.js: Added explicit false as second argument to make_changes() calls to suppress notifications.
  • src/api/system_api.js: Added new GET API method get_system_store with no system auth required.
  • src/server/system_services/system_server.js: Added async function get_system_store() returning recent DB data from the system store; exported it.
  • src/endpoint/endpoint.js: Changed the system_store.get_instance() call to pass {source: system_store.SOURCE.CORE} when running as a local metadata server.
  • src/rpc/rpc_schema.js: Disabled strict schema validation enforcement by commenting out strictification code in register_api().
  • src/util/postgres_client.js: Exported the decode_json function alongside existing exports.

Sequence Diagram(s)

sequenceDiagram
    participant SystemStore
    participant Database
    participant CoreRPC

    alt source = DB
        SystemStore->>Database: Read new data (limit concurrency by SYSTEM_STORE_LOAD_CONCURRENCY)
        Database-->>SystemStore: Return data
    else source = CORE
        SystemStore->>CoreRPC: Fetch system store data via RPC
        CoreRPC-->>SystemStore: Return data
    end
    SystemStore->>SystemStore: Store and return refresh result

This diagram illustrates SystemStore loading data conditionally from either the database with concurrency limits or from the core system RPC source, then storing and returning the refreshed data.



coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/server/system_services/system_store.js (1)

439-444: Minor: expensive debug payload serialisation

JSON.stringify(new_data) and util.inspect on large datasets allocate tens of MB and block the event-loop even when log level is low.
Consider wrapping both calls with the same dbg.should_log(1) guard to avoid the work entirely.
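A sketch of that guard, using a stand-in logger (`should_log` mirrors the `dbg.should_log(1)` name from the comment above; the real noobaa dbg module looks different):

```javascript
// Stand-in logger: serialization work is skipped entirely unless the level
// is enabled, instead of being done and then discarded.
const dbg = {
    level: 0,
    should_log(level) { return this.level >= level; },
    log1(...args) { if (this.should_log(1)) console.log(...args); },
};

function log_loaded_data(new_data) {
    // Guard BEFORE serializing: JSON.stringify over a large system store can
    // allocate tens of MB and block the event loop, even if the resulting
    // log line would be dropped by the logger anyway.
    if (dbg.should_log(1)) {
        dbg.log1('SystemStore loaded,', JSON.stringify(new_data).length, 'chars');
    }
}
```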

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c247ca1 and 6d4e3a1.

📒 Files selected for processing (2)
  • config.js (1 hunks)
  • src/server/system_services/system_store.js (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build-noobaa-image
  • GitHub Check: run-jest-unit-tests
  • GitHub Check: run-package-lock-validation
🔇 Additional comments (3)
src/server/system_services/system_store.js (3)

354-356: Constructor: keep the config values but fail fast on nonsense input

config.LOAD_FRESH_THRESHOLD and config.READ_NEW_DATA_CONCURRENCY are already integers.
Still, enforce basic sanity ( > 0 ) to avoid later surprises.

-this.LOAD_FRESH_THRESHOLD = config.LOAD_FRESH_THRESHOLD || 100;
-this.READ_NEW_DATA_CONCURRENCY = config.READ_NEW_DATA_CONCURRENCY || 5;
+this.LOAD_FRESH_THRESHOLD = Math.max(1, config.LOAD_FRESH_THRESHOLD);
+this.READ_NEW_DATA_CONCURRENCY = Math.max(1, config.READ_NEW_DATA_CONCURRENCY);

418-419: Good move: instantiate SystemStoreData once per load

Creating new_data outside the semaphore saves redundant allocations in contended scenarios.


520-520: ✅ Verification successful

Verify P.map_with_concurrency exists – otherwise this is a breaking change

P.map_with_concurrency is not part of the standard bluebird or native Promise helpers.
If the local util/promise never exported it, this will throw at runtime.

If not found, either add the helper or fall back to P.map with explicit throttling.


🏁 Script executed:

#!/bin/bash
# Check whether the helper is defined anywhere in the repo
rg --no-heading -n "function .*map_with_concurrency" | head



Verified helper implementation: map_with_concurrency is defined in src/util/promise.js (line 52), so this call is safe and does not introduce a breaking change.

config.js Outdated
Comment on lines 253 to 254
config.LOAD_FRESH_THRESHOLD = parseInt(process.env.LOAD_FRESH_THRESHOLD, 10) || 100;
config.READ_NEW_DATA_CONCURRENCY = parseInt(process.env.READ_NEW_DATA_CONCURRENCY, 10) || 5;


🛠️ Refactor suggestion

Guard against non-positive or NaN values when parsing the new tunables

parseInt() returns NaN for empty/non-numeric input, and users might also pass 0 or negative numbers by mistake.
Relying on the || fallback keeps the default safe, but if a caller explicitly sets READ_NEW_DATA_CONCURRENCY=0 or <0, the code will silently fall back to 5, hiding the mis-configuration. Make the validation explicit and fail early.

-config.LOAD_FRESH_THRESHOLD = parseInt(process.env.LOAD_FRESH_THRESHOLD, 10) || 100;
-config.READ_NEW_DATA_CONCURRENCY = parseInt(process.env.READ_NEW_DATA_CONCURRENCY, 10) || 5;
+const _freshRaw = parseInt(process.env.LOAD_FRESH_THRESHOLD, 10);
+config.LOAD_FRESH_THRESHOLD = Number.isFinite(_freshRaw) && _freshRaw > 0 ? _freshRaw : 100;
+
+const _concurrencyRaw = parseInt(process.env.READ_NEW_DATA_CONCURRENCY, 10);
+config.READ_NEW_DATA_CONCURRENCY =
+    Number.isFinite(_concurrencyRaw) && _concurrencyRaw > 0 ? _concurrencyRaw : 5;

This makes wrong inputs visible instead of being silently ignored.
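The pitfall is easy to demonstrate (values below are illustrative, not the real config):

```javascript
// parseInt returns NaN for empty/non-numeric input, and `|| fallback`
// silently swallows both NaN and an explicit 0, but not negatives.
const parse = raw => parseInt(raw, 10) || 5;

console.log(parse(undefined)); // 5  (unset -> NaN -> fallback, as intended)
console.log(parse('0'));       // 5  (explicit 0 silently ignored!)
console.log(parse('-3'));      // -3 (negative value accepted!)

// Explicit validation makes the misconfiguration visible instead:
function parse_positive_int(raw, fallback) {
    if (raw === undefined) return fallback;
    const n = parseInt(raw, 10);
    if (!Number.isFinite(n) || n <= 0) {
        throw new Error(`expected a positive integer, got ${JSON.stringify(raw)}`);
    }
    return n;
}
```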

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
config.LOAD_FRESH_THRESHOLD = parseInt(process.env.LOAD_FRESH_THRESHOLD, 10) || 100;
config.READ_NEW_DATA_CONCURRENCY = parseInt(process.env.READ_NEW_DATA_CONCURRENCY, 10) || 5;
const _freshRaw = parseInt(process.env.LOAD_FRESH_THRESHOLD, 10);
config.LOAD_FRESH_THRESHOLD = Number.isFinite(_freshRaw) && _freshRaw > 0 ? _freshRaw : 100;
const _concurrencyRaw = parseInt(process.env.READ_NEW_DATA_CONCURRENCY, 10);
config.READ_NEW_DATA_CONCURRENCY =
Number.isFinite(_concurrencyRaw) && _concurrencyRaw > 0 ? _concurrencyRaw : 5;

@alphaprinz alphaprinz force-pushed the system_store_load branch 3 times, most recently from 38341fc to 50e4d2c Compare June 19, 2025 05:05
@pull-request-size pull-request-size bot added size/S and removed size/M labels Jun 19, 2025
coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/server/system_services/system_store.js (1)

415-426: Address the past review concern about fresh-skip logic.

The past review comment identified a critical issue where the fresh-skip logic ignores the since parameter, potentially dropping updates when a caller requests data with an older timestamp but the store is considered "fresh".

The concern raised in the past review about the fresh-skip logic ignoring the since hint is still valid and should be addressed. The current implementation could skip necessary reloads when since < last_update_time but the data is considered fresh according to the threshold.
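One way to honor the `since` hint alongside the freshness threshold (function and field names here are assumptions for illustration, not the actual implementation):

```javascript
// Sketch: skip the reload only when the cached data is both within the
// freshness threshold AND at least as new as the caller's `since` hint.
function should_skip_load(last_update_time, since, fresh_threshold_ms, now = Date.now()) {
    const is_fresh = now - last_update_time < fresh_threshold_ms;
    // A caller passing `since` asks for data loaded at or after that time;
    // a "fresh" cache that predates `since` still forces a reload.
    const satisfies_since = since === undefined || last_update_time >= since;
    return is_fresh && satisfies_since;
}
```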

🧹 Nitpick comments (1)
src/server/system_services/system_store.js (1)

363-364: Review the periodic refresh strategy for potential interference.

The periodic refresh using setInterval ensures data freshness, but consider potential issues:

  1. The refresh runs every 10 minutes regardless of system activity
  2. It might interfere with explicit load operations or cause unnecessary database load during low-activity periods
  3. The arrow function binding might not preserve the correct this context

Consider using a more sophisticated refresh strategy:

-        setInterval(this.refresh, this.START_REFRESH_THRESHOLD);
+        setInterval(() => this.refresh(), this.START_REFRESH_THRESHOLD);

And potentially add activity-based refresh control to avoid unnecessary refreshes during idle periods.
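The `this`-binding concern is real for any prototype method that touches instance state (a minimal repro, not the actual SystemStore):

```javascript
class Store {
    constructor() { this.loads = 0; }
    refresh() {
        // With an unbound method reference, `this` is undefined here
        // (class bodies run in strict mode), so this line throws.
        this.loads += 1;
    }
}

const store = new Store();
const unbound = store.refresh;       // what setInterval(this.refresh, ms) receives
const bound = () => store.refresh(); // the arrow-function fix

let unbound_threw = false;
try { unbound(); } catch (err) { unbound_threw = true; }
bound(); // works: `this` is resolved at call time through `store`
```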

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 50e4d2c and 5c2e8fb.

📒 Files selected for processing (4)
  • config.js (1 hunks)
  • src/server/bg_services/cluster_hb.js (1 hunks)
  • src/server/bg_services/md_aggregator.js (3 hunks)
  • src/server/system_services/system_store.js (6 hunks)
✅ Files skipped from review due to trivial changes (1)
  • config.js
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: run-jest-unit-tests
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-package-lock-validation
  • GitHub Check: Summary
🔇 Additional comments (9)
src/server/bg_services/cluster_hb.js (1)

96-96: LGTM - Performance optimization for heartbeat updates.

Disabling notifications for heartbeat updates reduces unnecessary cluster propagation overhead. This is appropriate since heartbeat data is typically consumed locally and doesn't require immediate cluster-wide synchronization.

src/server/bg_services/md_aggregator.js (3)

71-71: LGTM - Performance optimization for storage stats updates.

Disabling notifications for incremental storage statistics updates is appropriate since these are internal maintenance operations that don't require immediate cluster-wide propagation.


81-81: LGTM - Consistent with batch processing approach.

Disabling notifications for the global last update timestamp aligns with the incremental processing strategy where notifications are handled separately from the update batching.


209-209: Verify notification strategy for time skew reset operations.

While disabling notifications here aligns with the performance optimization, the time skew reset operation resets all storage statistics to NOOBAA_EPOCH. This is a significant system state change that might benefit from immediate cluster propagation to ensure consistency.

Consider whether this critical recovery operation should retain notifications or if the delayed propagation is acceptable for your use case.

src/server/system_services/system_store.js (5)

354-354: LGTM - Configurable concurrency control for database load balancing.

Adding configurable concurrency control helps prevent database overload during system store loading. The default value of 5 seems reasonable for most deployments.


515-515: LGTM - Concurrency control for database collection reads.

Replacing P.map with P.map_with_concurrency using the configured limit prevents overwhelming the database during incremental data loading. This addresses the performance objective of optimizing concurrent database queries.


607-607: LGTM - Flexible notification control for performance optimization.

Adding the publish parameter with a default value of true maintains backward compatibility while allowing callers to disable notifications for performance-sensitive operations. This supports the performance optimization goals.


617-617: LGTM - Conditional notification publishing implementation.

The conditional check properly implements the notification control based on the publish parameter, maintaining the existing behavior when notifications are enabled.
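The backward-compatible shape of that signature change can be sketched as follows (this is not the real async `make_changes`; the change payloads are made up):

```javascript
// Existing callers pass only `changes` and keep the old always-publish
// behavior; performance-sensitive callers (like md_aggregator) opt out.
const published_changes = [];

function make_changes(changes, publish = true) {
    // ... persist changes to the DB (elided) ...
    if (publish) published_changes.push(changes);
    return changes;
}

make_changes({ update: { clusters: ['heartbeat'] } });           // published
make_changes({ update: { buckets: ['storage_stats'] } }, false); // suppressed
```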


433-439: Enhanced logging provides valuable performance insights.

The detailed logging with conditional debug level checks helps monitor system store performance without impacting production performance when debugging is disabled.

@alphaprinz alphaprinz marked this pull request as draft June 24, 2025 22:28
@alphaprinz alphaprinz marked this pull request as ready for review June 25, 2025 16:43
}
//call refresh periodically
P.delay_unblocking(this.START_REFRESH_THRESHOLD).then(this.refresh);
Member

this can cause a lot of refresh() calls to be queued. I believe that refresh() is called on every RPC request as a middleware (see here), so I don't think we should schedule another refresh.

Signed-off-by: Amit Prinz Setter <alphaprinz@gmail.com>
coderabbitai bot left a comment

Actionable comments posted: 3

🔭 Outside diff range comments (1)
src/rpc/rpc_schema.js (1)

62-74: Strict RPC schema validation was globally disabled—unexpected fields now pass through

By commenting out the schema_utils.strictify calls in src/rpc/rpc_schema.js (lines 62–74), all RPC‐level schemas revert to AJV’s default of additionalProperties: true. Since most API definitions don’t explicitly set additionalProperties: false, extra or malformed properties will no longer be rejected in RPC calls.

Please restore targeted strictness to prevent data-integrity or security gaps. For example:

  • In src/rpc/rpc_schema.js, reapply:
    class RpcSchema {
        register_api(api) {
            assert(!this[api.$id], …);
            assert(!this._compiled, …);
    +       _.each(api.definitions, schema => {
    +           schema_utils.strictify(schema, { additionalProperties: false });
    +       });
    +       _.each(api.methods, method_api => {
    +           schema_utils.strictify(method_api.params, { additionalProperties: false });
    +           schema_utils.strictify(method_api.reply, { additionalProperties: false });
    +       });
            this._ajv.addSchema(api);
            this[api.$id] = api;
        }
  • Or explicitly add additionalProperties: false in critical method‐level schemas within your API definition files.

[src/rpc/rpc_schema.js: lines 62–74]
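What strictification buys can be shown without Ajv (a hypothetical stub, not the real `schema_utils.strictify` or the Ajv validator; real RPC validation compiles the full schema):

```javascript
// Minimal illustration of additionalProperties enforcement: with the schema
// left open (JSON Schema's default), unexpected top-level keys pass silently.
function extra_props(schema, obj) {
    if (schema.additionalProperties !== false) return []; // open schema: all pass
    const allowed = new Set(Object.keys(schema.properties || {}));
    return Object.keys(obj).filter(key => !allowed.has(key));
}

const open_schema = { properties: { name: {} } }; // default after un-strictifying
const strict_schema = { properties: { name: {} }, additionalProperties: false };

extra_props(open_schema, { name: 'a', injected: 1 });   // [] - slips through
extra_props(strict_schema, { name: 'a', injected: 1 }); // ['injected'] - caught
```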

♻️ Duplicate comments (1)
src/server/system_services/system_store.js (1)

404-416: Improved method structure but still has refresh queuing concern.

The modification to store and return the result instead of early returns improves code structure. However, the past review comment about refresh() queuing still applies - this method is called on every RPC request as middleware, which could cause performance issues.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 36b9892 and 93bd92a.

📒 Files selected for processing (9)
  • config.js (2 hunks)
  • src/api/system_api.js (1 hunks)
  • src/endpoint/endpoint.js (1 hunks)
  • src/rpc/rpc_schema.js (2 hunks)
  • src/server/bg_services/cluster_hb.js (1 hunks)
  • src/server/bg_services/md_aggregator.js (3 hunks)
  • src/server/system_services/system_server.js (2 hunks)
  • src/server/system_services/system_store.js (10 hunks)
  • src/util/postgres_client.js (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/util/postgres_client.js
🚧 Files skipped from review as they are similar to previous changes (3)
  • config.js
  • src/server/bg_services/md_aggregator.js
  • src/server/bg_services/cluster_hb.js
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/server/system_services/system_server.js (16)
src/server/bg_services/cluster_hb.js (1)
  • system_store (12-12)
src/endpoint/endpoint.js (1)
  • system_store (36-36)
src/server/object_services/object_server.js (1)
  • system_store (31-31)
src/server/notifications/dispatcher.js (1)
  • system_store (13-13)
src/server/system_services/pool_server.js (1)
  • system_store (19-19)
src/test/unit_tests/coretest.js (1)
  • system_store (49-49)
src/server/system_services/bucket_server.js (1)
  • system_store (28-28)
src/hosted_agents/hosted_agents.js (1)
  • system_store (10-10)
src/sdk/map_client.js (1)
  • system_store (22-22)
src/test/unit_tests/test_encryption.js (1)
  • system_store (9-9)
src/server/system_services/cluster_server.js (1)
  • system_store (24-24)
src/test/unit_tests/test_map_builder.js (1)
  • system_store (26-26)
src/server/node_services/nodes_aggregator.js (1)
  • system_store (9-9)
src/server/bg_services/agent_blocks_reclaimer.js (1)
  • system_store (9-9)
src/server/object_services/map_builder.js (1)
  • system_store (16-16)
src/server/bg_services/key_rotator.js (1)
  • system_store (7-7)
src/endpoint/endpoint.js (2)
src/server/system_services/system_server.js (2)
  • system_store (32-32)
  • system_store (1196-1196)
src/server/web_server.js (2)
  • system_store (29-29)
  • system_store (273-273)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run-package-lock-validation
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-jest-unit-tests
🔇 Additional comments (11)
src/server/system_services/system_server.js (2)

306-308: LGTM: Clean implementation of new system store API method

The async function correctly calls system_store.recent_db_data() and follows the established patterns in the file. This aligns with the PR objective of optimizing system store performance by providing access to recent in-memory data.


1607-1607: LGTM: Proper export of new function

The new function is correctly exported following the existing export pattern in the file.

src/endpoint/endpoint.js (1)

174-174: ✔️ Verified and Approved: system_store.SOURCE.CORE Defined and Exported

I’ve confirmed that SOURCE.CORE is referenced in the constructor and that exports.SOURCE = SOURCE correctly exposes the enum. The change in src/endpoint/endpoint.js to explicitly load from SOURCE.CORE under LOCAL_MD_SERVER === 'true' aligns with the dual‐source design and requires no further action.

– No issues found; ready to merge.

src/server/system_services/system_store.js (8)

41-41: LGTM: Import addition is appropriate.

The import of decode_json from postgres_client is correctly placed and will be used for decoding JSON fields in the new core source loading functionality.


156-159: LGTM: Well-structured source enum.

The SOURCE enum is properly frozen and provides clear constants for the dual-source architecture. The naming is clear and follows JavaScript conventions.


477-481: LGTM: New method provides clean access to recent data.

The recent_db_data() method correctly uses the load semaphore to ensure thread-safe access to the latest in-memory data. This is a clean implementation for the new API endpoint.


535-535: Good: Concurrency control implementation.

The addition of configurable concurrency control using P.map_with_concurrency is a solid performance improvement. The default of 5 concurrent operations is reasonable and the configuration via environment variable provides flexibility.


548-567: Review the core data loading implementation.

The new _read_new_data_from_core() method implementation looks solid overall, but has a few areas for improvement:

  1. Good JSON decoding: Proper use of decode_json with schema validation
  2. Proper validation: Calls to db_client.instance().validate() ensure data integrity
  3. Robust error handling: Good pattern for handling unknown collections

Consider adding error handling for the RPC call:

    async _read_new_data_from_core(target) {
        dbg.log0("_read_new_data_from_core");
-       const res = await server_rpc.client.system.get_system_store();
+       try {
+           const res = await server_rpc.client.system.get_system_store();
+           // ... rest of the method
+       } catch (err) {
+           dbg.error('Failed to load system store from core:', err);
+           throw err;
+       }
    }

661-669: Good: Conditional cluster notification implementation.

The conditional publishing based on the publish parameter is well-implemented and allows for controlled cluster notifications. The commented condition check /*if (publish)*/ suggests this was intentionally left as always-publish for now, which is a safe approach.


897-897: LGTM: SOURCE export is appropriate.

The export of the SOURCE enum allows other modules to use the same constants, promoting consistency across the codebase.


648-648: Signature change is backward-compatible

I’ve reviewed all 200+ invocations of make_changes(...). Every existing call passes only the changes object and relies on the new publish = true default. The sole explicit override in src/server/bg_services/md_aggregator.js already sets publish = false, as intended. No further updates are required.

Comment on lines +464 to 483
get_system_store: {
method: 'GET',
/*params: {
type: 'object'
},*/
reply: {
type: 'object',
/*properites: {
systems: {
type: 'array',
items: {
$ref: '#/definitions/role_info'
}
}
}*/
},
auth: {
system: false
}
}

⚠️ Potential issue

Complete the API schema definitions and verify access permissions.

The new API method has several issues that need to be addressed:

  1. Commented-out schema definitions: The params and reply.properties are commented out, which is not suitable for production code. This leaves the API without proper validation.

  2. Broad access permissions: The auth: { system: false } setting allows broader access than most other endpoints in this file. Verify this is intentional for security.

  3. Missing documentation: The method lacks a doc field to describe its purpose.

        get_system_store: {
+           doc: 'Get system store data',
            method: 'GET',
-           /*params: {
-               type: 'object'
-           },*/
            reply: {
                type: 'object',
-               /*properites: {
-                   systems: {
-                       type: 'array',
-                       items: {
-                           $ref: '#/definitions/role_info'
-                       }
-                   }
-               }*/
+               // TODO: Define proper schema for system store data
            },
            auth: {
-               system: false
+               system: false  // TODO: Verify this access level is appropriate
            }
        }
📝 Committable suggestion


Suggested change
get_system_store: {
method: 'GET',
/*params: {
type: 'object'
},*/
reply: {
type: 'object',
/*properites: {
systems: {
type: 'array',
items: {
$ref: '#/definitions/role_info'
}
}
}*/
},
auth: {
system: false
}
}
get_system_store: {
doc: 'Get system store data',
method: 'GET',
reply: {
type: 'object',
// TODO: Define proper schema for system store data
},
auth: {
system: false // TODO: Verify this access level is appropriate
}
}

Comment on lines +436 to +441
if (this.source === SOURCE.DB) {
await this._read_new_data_from_db(new_data);
} else {
this.data = new SystemStoreData();
await this._read_new_data_from_core(this.data);
}

🛠️ Refactor suggestion

Verify the data initialization logic for CORE source.

The conditional loading logic has a potential issue: for CORE source, this.data is reassigned to a new SystemStoreData() instance, but the original new_data instance is discarded. This could cause inconsistencies.

                if (this.source === SOURCE.DB) {
                    await this._read_new_data_from_db(new_data);
                } else {
-                   this.data = new SystemStoreData();
-                   await this._read_new_data_from_core(this.data);
+                   await this._read_new_data_from_core(new_data);
                }
📝 Committable suggestion


Suggested change
if (this.source === SOURCE.DB) {
await this._read_new_data_from_db(new_data);
} else {
this.data = new SystemStoreData();
await this._read_new_data_from_core(this.data);
}
if (this.source === SOURCE.DB) {
await this._read_new_data_from_db(new_data);
} else {
await this._read_new_data_from_core(new_data);
}

Comment on lines 359 to 361
this.SYSTEM_STORE_LOAD_CONCURRENCY = config.SYSTEM_STORE_LOAD_CONCURRENCY || 5;
this.source = process.env.HOSTNAME.indexOf("endpoint") === -1 ? SOURCE.DB : SOURCE.CORE;
dbg.log0("system store source is", this.source);

⚠️ Potential issue

Source determination logic is brittle and could fail.

The hostname-based source determination uses a simple string indexOf check which could be unreliable:

  1. Brittle hostname matching: The check process.env.HOSTNAME.indexOf("endpoint") === -1 assumes hostnames follow a specific naming pattern
  2. Potential runtime errors: If HOSTNAME environment variable is undefined, this will throw an error
  3. Hard-coded string matching: The "endpoint" string is hard-coded without constants
-        this.source = process.env.HOSTNAME.indexOf("endpoint") === -1 ? SOURCE.DB : SOURCE.CORE;
+        const hostname = process.env.HOSTNAME || '';
+        this.source = hostname.includes("endpoint") ? SOURCE.CORE : SOURCE.DB;

Consider using a more robust configuration approach:

+        // Use explicit config or environment variable for source selection
+        this.source = process.env.SYSTEM_STORE_SOURCE || 
+                     (hostname.includes("endpoint") ? SOURCE.CORE : SOURCE.DB);
📝 Committable suggestion


Suggested change
this.SYSTEM_STORE_LOAD_CONCURRENCY = config.SYSTEM_STORE_LOAD_CONCURRENCY || 5;
this.source = process.env.HOSTNAME.indexOf("endpoint") === -1 ? SOURCE.DB : SOURCE.CORE;
dbg.log0("system store source is", this.source);
this.SYSTEM_STORE_LOAD_CONCURRENCY = config.SYSTEM_STORE_LOAD_CONCURRENCY || 5;
const hostname = process.env.HOSTNAME || '';
this.source = hostname.includes("endpoint") ? SOURCE.CORE : SOURCE.DB;
// Use explicit config or environment variable for source selection
this.source = process.env.SYSTEM_STORE_SOURCE ||
(hostname.includes("endpoint") ? SOURCE.CORE : SOURCE.DB);
dbg.log0("system store source is", this.source);

@alphaprinz alphaprinz marked this pull request as draft July 9, 2025 18:01
Signed-off-by: Amit Prinz Setter <alphaprinz@gmail.com>
@alphaprinz alphaprinz force-pushed the system_store_load branch from 93bd92a to 1b2b439 Compare July 9, 2025 18:21
Comment on lines +239 to +240
config.POSTGRES_DEFAULT_MAX_CLIENTS = 5;
config.POSTGRES_MD_MAX_CLIENTS = (process.env.LOCAL_MD_SERVER === 'true') ? 70 : 5;
Member

Why are you changing this?

},*/
reply: {
type: 'object',
/*properites: {
Member

We should maybe use RPC_BUFFERS here; see the example in block_store_api.
